PH345: Winter 2025
Representation of numbers should equal quantities represented
What the plot has: \(x/y = 257.87/147\) (numbers proportional to diameter or radius)
What the plot should have: \(x^2/y^2 = 257.87/147\) (numbers proportional to area)
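As a rough check of the arithmetic above, the sketch below takes the two values quoted on the slide (257.87 and 147) and compares what happens when the radius is scaled by the value with what happens when the area is. The variable names are illustrative and are not taken from the code behind the original plot.

```r
# The two values quoted on the slide
value_x <- 257.87
value_y <- 147

# Ratio the plot is supposed to communicate (about 1.75)
value_ratio <- value_x / value_y

# If the radius (or diameter) is drawn proportional to the value,
# the drawn areas differ by the square of that ratio (about 3.08)
area_ratio_radius_scaled <- value_ratio^2

# To make the area proportional to the value instead,
# the radii must differ by the square root of the ratio (about 1.32)
radius_ratio_area_scaled <- sqrt(value_ratio)

cat("Ratio of the values:              ", round(value_ratio, 2), "\n")
cat("Area ratio when radius is scaled: ", round(area_ratio_radius_scaled, 2), "\n")
cat("Radius ratio when area is scaled: ", round(radius_ratio_area_scaled, 2), "\n")
```

In `ggplot2`, `scale_size()` and `scale_size_area()` map values to point area, whereas `scale_radius()` maps them to radius, so the choice of scale decides which of the two ratios the reader sees.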
Representation of numbers should equal quantities represented
Demonstration of how dynamite plots do not give an accurate representation of the data’s distribution. A and C show dynamite plots; B and D show ‘beeswarm’ plots, a type of univariate scatter plot. A and B both represent the same dataset (‘Dataset 1’), and C and D represent another (‘Dataset 2’). The dynamite plots A and C are identical, even though Dataset 1 and Dataset 2 are vastly different. B and D give a good representation of the two datasets, allowing the reader to note that although they have the same mean and standard error, they have vastly different distributions.
Figure 1, Doggett and Way (2024)
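The point of the figure can be reproduced with a small base R sketch. The two datasets below are made up for illustration (they are not the data from the paper): one is evenly spread and one is split into two clusters, and both are rescaled to the same mean and standard deviation, so a dynamite plot of mean and standard error would look the same for both. The helper names `std_error` and `rescale_to` are hypothetical. `stripchart()` draws a jittered univariate scatter plot, which is similar in spirit to a beeswarm plot but not identical to it.

```r
# Standard error of the mean
std_error <- function(x) sd(x) / sqrt(length(x))

# Shift and scale a vector so it has exactly the requested mean and SD
rescale_to <- function(x, target_mean, target_sd) {
  (x - mean(x)) / sd(x) * target_sd + target_mean
}

# Made-up raw shapes: one evenly spread, one split into two clusters
spread_raw  <- seq(-1, 1, length.out = 20)
cluster_raw <- c(seq(-1.1, -0.9, length.out = 10),
                 seq(0.9, 1.1, length.out = 10))

dataset1 <- rescale_to(spread_raw,  target_mean = 10, target_sd = 2)
dataset2 <- rescale_to(cluster_raw, target_mean = 10, target_sd = 2)

# The only numbers a dynamite plot would display
summary_table <- data.frame(
  dataset = c("Dataset 1", "Dataset 2"),
  mean    = c(mean(dataset1), mean(dataset2)),
  se      = c(std_error(dataset1), std_error(dataset2))
)
print(summary_table)

# A univariate scatter plot reveals the difference (drawn only interactively)
if (interactive()) {
  stripchart(list("Dataset 1" = dataset1, "Dataset 2" = dataset2),
             method = "jitter", vertical = TRUE, pch = 16,
             ylab = "Value")
}
```

The summary table has matching rows for the two datasets, while the strip chart shows that Dataset 2 has no observations near its own mean.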
Use clear, detailed, and thorough labeling, especially if there is risk of distortion or ambiguity
How did the number of gun deaths change after introduction of Stand Your Ground law?
Show data variation, not design variation
Numbers on the left-hand side are uninformative because each survey has a different sample size
Cartoon people are redundant
Proportions would be impossible to see without annotations
Difficult to compare across years
Obese people in the Canary Islands in 2004, 2009 and 2015. Pink area shows the proportion of people who are obese, while grey area is related to non-obese people. The percentages refer to the total number of people of their respective group.
Figure 1, Hernández-Yumar, et al. (2019)
Show data variation, not design variation
Size of cartoon women grows in proportion to height
See also: Representation of numbers should equal quantities represented
Show data variation, not design variation
Classification of [Transcription factor binding sites] Regions
Figure 1, Cawley, et al. (2004)
In time-series displays of money, use inflation-adjusted units
Year-by-year changes in core funding in the UK relative to year 2010. Not adjusted for inflation
In time-series displays of money, use inflation-adjusted units
Year-by-year changes in core funding in the UK relative to year 2010, after adjusting for inflation
In time-series displays of money, use inflation-adjusted units
Year-by-year changes in core funding per capita in the UK relative to year 2010, after adjusting for inflation
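The three versions of the funding series above differ only in how the money is standardized. A minimal sketch of that arithmetic follows, using entirely hypothetical numbers (not the UK data behind the figures) and an assumed price index with 2010 set to 100. The column names and the `pct_change` helper are illustrative.

```r
# Hypothetical figures for illustration only (not the data behind the slides)
funding <- data.frame(
  year        = c(2010, 2015, 2020),
  nominal     = c(100, 105, 110),   # funding in current-year money
  price_index = c(100, 110, 125),   # assumed price level, 2010 = 100
  population  = c(62, 65, 67)       # assumed population, in millions
)

is_base <- funding$year == 2010

# Deflate: express every year's funding in 2010 prices
funding$real <- funding$nominal * funding$price_index[is_base] / funding$price_index

# Standardize the deflated series by population
funding$real_per_capita <- funding$real / funding$population

# Percent change relative to the base year
pct_change <- function(x, base) 100 * (x / x[base] - 1)

funding$change_nominal         <- pct_change(funding$nominal, is_base)
funding$change_real            <- pct_change(funding$real, is_base)
funding$change_real_per_capita <- pct_change(funding$real_per_capita, is_base)

print(funding)
```

With these made-up inputs, the nominal series rises after 2010 while the inflation-adjusted series falls, and the per-capita series falls further still, which is the kind of reversal the three slides illustrate.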
Number of dimensions (encodings) should not exceed number of datapoints
Empirical coverage of CIs for the relative-risk parameter b of haplotype 01100. Results are based on 10,000 simulated data sets with the same haplotype frequencies as the FUSION data.
Figure 1, Epstein and Satten (2003)
Number of dimensions (encodings) should not exceed number of datapoints
Departure from Hardy-Weinberg equilibrium under additive model (top) and multiplicative model (bottom). The authors note: “For a multiplicative model, [DHW] is equal to 0.”
Figure 1C and 1D, Wittke-Thompson, et al. (2005)
Data visualization expert
Author of “Numbersense” and “Numbers Rule Your World”
Writes Junk Charts blog
Giannarelli, et al. (2023)
# hint 1: use the `labels = scales::dollar` argument in `scale_y_continuous` to format the y-axis as dollars
# hint 2: if you want to 'zoom in' on a plot, *don't* use the `limits` argument
# in `scale_y_continuous`. Doing so will actually drop the data and potentially
# change the plot itself. Use `coord_cartesian` instead
No Spice: Make an approximate version of the bar chart on slide 20
Weak Sauce: No menu options today…
Medium Spice: Make an approximate version of the misleading bar chart on slide 19
Yoga Flame: No menu options today…
Dim Mak: Make an exact replica of the misleading bar chart on slide 19. I’m looking for perfection!
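The two hints given before the menu can be sketched as follows, assuming the `ggplot2` and `scales` packages are installed. The `benefits` data frame is made up for illustration and is not the data from the slides. The plot shows the group means, so dropping an out-of-range value changes what is drawn, not just how much of it is visible.

```r
library(ggplot2)

# Made-up dollar amounts for illustration only
benefits <- data.frame(
  group  = c("A", "A", "A", "B", "B", "B"),
  amount = c(1000, 2000, 9000, 1500, 2500, 3500)
)

# Plot the mean amount per group
base_plot <- ggplot(benefits, aes(x = group, y = amount)) +
  stat_summary(fun = mean, geom = "point", size = 3)

# `limits` on the scale: the 9000 value falls outside the limits and is
# dropped before the mean is computed, so the point for group A moves
zoom_limits <- base_plot +
  scale_y_continuous(labels = scales::dollar, limits = c(0, 5000))

# `coord_cartesian`: the mean is computed from all of the data and only
# the visible window changes
zoom_coord <- base_plot +
  scale_y_continuous(labels = scales::dollar) +
  coord_cartesian(ylim = c(0, 5000))
```

Printing `zoom_limits` and `zoom_coord` side by side shows the same y-axis window with dollar-formatted labels, but a different position for group A. `ggplot2` also emits a warning about removed rows for the `limits` version.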
Cawley, S., Bekiranov, S., Ng, H.H., Kapranov, P., Sekinger, E.A., Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J., Williams, A.J. and Wheeler, R., 2004. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell, 116(4), pp.499-509.
Doggett, T.J. and Way, C., 2024. Dynamite plots in surgical research over 10 years: a meta-study using machine-learning analysis. Postgraduate Medical Journal, 100(1182), pp.262-266.
Epstein, M.P. and Satten, G.A., 2003. Inference on haplotype effects in case-control studies using unphased genotype data. The American Journal of Human Genetics, 73(6), pp.1316-1329.
Giannarelli, L., Minton, S., Wheaton, L. and Knowles, S., 2023. A Safety Net with 100 Percent Participation: How Much Would Benefits Increase and Poverty Decline? Washington, DC: The Urban Institute.
Hernández-Yumar, A., Abásolo Alessón, I. and González López-Valcárcel, B., 2019. Economic crisis and obesity in the Canary Islands: an exploratory study through the relationship between body mass index and educational level. BMC Public Health, 19, pp.1-9.
Wittke-Thompson, J.K., Pluzhnikov, A. and Cox, N.J., 2005. Rational inferences about departures from Hardy-Weinberg equilibrium. The American Journal of Human Genetics, 76(6), pp.967-986.